 diffraction data


XDXD: End-to-end crystal structure determination with low resolution X-ray diffraction

Zhao, Jiale, Liu, Cong, Zhang, Yuxuan, Gong, Chengyue, Zhang, Zhenyi, Jin, Shifeng, Liu, Zhenyu

arXiv.org Artificial Intelligence

Determining crystal structures from X-ray diffraction data is fundamental across diverse scientific fields, yet remains a significant challenge when data is limited to low resolution. While recent deep learning models have made breakthroughs in solving the crystallographic phase problem, the resulting low-resolution electron density maps are often ambiguous and difficult to interpret. To overcome this critical bottleneck, we introduce XDXD, to our knowledge, the first end-to-end deep learning framework to determine a complete atomic model directly from low-resolution single-crystal X-ray diffraction data. Our diffusion-based generative model bypasses the need for manual map interpretation, producing chemically plausible crystal structures conditioned on the diffraction pattern. We demonstrate that XDXD achieves a 70.4% match rate for structures with data limited to 2.0 Å resolution, with a root-mean-square error (RMSE) below 0.05. Evaluated on a benchmark of 24,000 experimental structures, our model proves to be robust and accurate. Furthermore, a case study on small peptides highlights the model's potential for extension to more complex systems, paving the way for automated structure solution in previously intractable cases.


opXRD: Open Experimental Powder X-ray Diffraction Database

Hollarek, Daniel, Schopmans, Henrik, Östreicher, Jona, Teufel, Jonas, Cao, Bin, Alwen, Adie, Schweidler, Simon, Singh, Mriganka, Kodalle, Tim, Hu, Hanlin, Heymans, Gregoire, Abdelsamie, Maged, Hardiagon, Arthur, Wieczorek, Alexander, Zhuk, Siarhei, Schwaiger, Ruth, Siol, Sebastian, Coudert, François-Xavier, Wolf, Moritz, Sutter-Fella, Carolin M., Breitung, Ben, Hodge, Andrea M., Zhang, Tong-yi, Friederich, Pascal

arXiv.org Artificial Intelligence

Powder X-ray diffraction (pXRD) experiments are a cornerstone for materials structure characterization. Despite their widespread application, analyzing pXRD diffractograms still presents a significant challenge to automation and a bottleneck in high-throughput discovery in self-driving labs. Machine learning promises to resolve this bottleneck by enabling automated powder diffraction analysis. A notable difficulty in applying machine learning to this domain is the lack of sufficiently sized experimental datasets, which has constrained researchers to train primarily on simulated data. However, models trained on simulated pXRD patterns showed limited generalization to experimental patterns, particularly for low-quality experimental patterns with high noise levels and elevated backgrounds. With the Open Experimental Powder X-Ray Diffraction Database (opXRD), we provide an openly available and easily accessible dataset of labeled and unlabeled experimental powder diffractograms. Labeled opXRD data can be used to evaluate the performance of models on experimental data and unlabeled opXRD data can help improve the performance of models on experimental data, e.g. through transfer learning methods. We collected 92,552 diffractograms, 2,179 of them labeled, from a wide spectrum of materials classes. We hope this ongoing effort can guide machine learning research toward fully automated analysis of pXRD data and thus enable future self-driving materials labs.


CrystalX: Ultra-Precision Crystal Structure Resolution and Error Correction Using Deep Learning

Zheng, Kaipeng, Huang, Weiran, Ouyang, Wanli, Zhong, Han-Sen, Li, Yuqiang

arXiv.org Artificial Intelligence

Atomic structure analysis of crystalline materials is a paramount endeavor in both chemical and material sciences. This sophisticated technique necessitates not only a solid foundation in crystallography but also a profound comprehension of the intricacies of the accompanying software, posing a significant challenge in meeting the rigorous daily demands. For the first time, we confront this challenge head-on by harnessing the power of deep learning for ultra-precise structural analysis at the full-atom level. To validate the performance of the model, named CrystalX, we employed a vast dataset comprising over 50,000 X-ray diffraction measurements derived from authentic experiments, demonstrating performance that is commensurate with human experts and adept at deciphering intricate geometric patterns. Remarkably, CrystalX revealed that even peer-reviewed publications can harbor errors that are stealthy to human scrutiny, yet CrystalX adeptly rectifies them. This deep learning model revolutionizes the time frame for crystal structure analysis, slashing it down to seconds. It has already been successfully applied in the structure analysis of newly discovered compounds in the latest research without human intervention. Overall, CrystalX marks the beginning of a new era in automating routine structural analysis within self-driving laboratories.


Deep-learning real-time phase retrieval of imperfect diffraction patterns from X-ray free-electron lasers

Lee, Sung Yun, Cho, Do Hyung, Jung, Chulho, Sung, Daeho, Nam, Daewoong, Kim, Sangsoo, Song, Changyong

arXiv.org Artificial Intelligence

Machine learning is attracting surging interest across nearly all scientific areas by enabling the analysis of large datasets and the extraction of scientific information from incomplete data. Data-driven science is rapidly growing, especially in X-ray methodologies, where advanced light sources and detection technologies accumulate vast amounts of data that exceed meticulous human inspection capabilities. Despite the increasing demands, the full application of machine learning has been hindered by the need for data-specific optimizations. In this study, we introduce a new deep-learning-based phase retrieval method for imperfect diffraction data. This method provides robust phase retrieval for simulated data and performs well on weak-signal single-pulse diffraction data from X-ray free-electron lasers. Moreover, the method significantly reduces data processing time, facilitating real-time image reconstructions that are crucial for high-repetition-rate data acquisition. Thus, this approach offers a reliable solution to the phase problem and is expected to be widely adopted across various research areas.
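As a small illustration of the phase problem mentioned in this abstract, the sketch below shows why intensities alone do not determine the object: a detector records only the squared modulus of the Fourier transform, so two different densities can give the same pattern. This is not the authors' method; the 1D density values and the helper names `dft` and `intensities` are hypothetical, chosen only to keep the example dependency-free.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * cmath.pi * q * k / n) for k in range(n))
        for q in range(n)
    ]

def intensities(signal):
    """Squared modulus of each Fourier coefficient: phases are discarded."""
    return [abs(c) ** 2 for c in dft(signal)]

# Hypothetical 1D "density" and a circularly shifted copy of it.
density = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0]
shifted = density[-3:] + density[:-3]

# The two densities differ in real space...
objects_differ = density != shifted

# ...but a shift only changes Fourier phases, so the intensities agree.
intensities_match = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(intensities(density), intensities(shifted))
)

print(objects_differ, intensities_match)
```

A phase retrieval method, learned or iterative, has to supply the missing phases from prior knowledge such as support or positivity constraints or a trained model.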


Three-Dimensional, Multimodal Synchrotron Data for Machine Learning Applications

Green, Calum, Ahmed, Sharif, Marathe, Shashidhara, Perera, Liam, Leonardi, Alberto, Gmyrek, Killian, Dini, Daniele, Houx, James Le

arXiv.org Artificial Intelligence

Machine learning techniques are being increasingly applied in medical and physical sciences across a variety of imaging modalities; however, an important issue when developing these tools is the availability of good quality training data. Here we present a unique, multimodal synchrotron dataset of a bespoke zinc-doped Zeolite 13X sample that can be used to develop advanced deep learning and data fusion pipelines. Multi-resolution micro X-ray computed tomography was performed on a zinc-doped Zeolite 13X fragment to characterise its pores and features, before spatially resolved X-ray diffraction computed tomography was carried out to characterise the homogeneous distribution of sodium and zinc phases. Zinc absorption was controlled to create a simple, spatially isolated, two-phase material. Both raw and processed data are available as a series of Zenodo entries. Altogether we present a spatially resolved, three-dimensional, multimodal, multi-resolution dataset that can be used for the development of machine learning techniques. Such techniques include super-resolution, multimodal data fusion, and 3D reconstruction algorithms.


Deciphering diffuse scattering with machine learning and the equivariant foundation model: The case of molten FeO

Sivaraman, Ganesh, Benmore, Chris J.

arXiv.org Artificial Intelligence

Bridging the gap between diffuse x-ray or neutron scattering measurements and predicted structures derived from atom-atom pair potentials in disordered materials has been a longstanding challenge in condensed matter physics. This perspective gives a brief overview of the traditional approaches employed over the past several decades, namely the use of approximate interatomic pair potentials that relate 3-dimensional structural models to the measured structure factor and its associated pair distribution function. The use of machine learned interatomic potentials has grown in the past few years, and has been particularly successful in the cases of ionic and oxide systems. Recent advances in large scale sampling, along with a direct integration of scattering measurements into the model development, have provided improved agreement between experiments and large-scale models calculated with quantum mechanical accuracy. However, details of local polyhedral bonding and connectivity in meta-stable disordered systems still require improvement. Here we leverage MACE-MP-0, a newly introduced equivariant foundation model, and validate the results against high-quality experimental scattering data for the case of molten iron(II) oxide (FeO). These preliminary results suggest that the emerging foundation model has the potential to surpass the traditional limitations of classical interatomic potentials.
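For orientation, the link between the structure factor and the pair distribution function mentioned in this abstract can be written in its textbook single-component form. This is an illustrative assumption rather than the formulation used in the paper: $\rho$ is the atomic number density of an isotropic liquid, $S(Q)$ the static structure factor, and $g(r)$ the pair distribution function. A two-component melt such as FeO would normally be described with weighted partial structure factors instead.

```latex
S(Q) = 1 + 4\pi\rho \int_0^\infty r^2 \,\bigl[g(r) - 1\bigr]\, \frac{\sin(Qr)}{Qr}\, dr
```

In this form, a structural model that predicts $g(r)$ can be compared against a measured $S(Q)$ through a single integral transform.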


Machine learning enhances X-ray imaging of nanotextures

AIHub

From Real-space imaging of polar and elastic nano-textures in thin films via inversion of diffraction data, reproduced under a CC BY 4.0 licence. Using a combination of high-powered X-rays, phase-retrieval algorithms and machine learning, researchers revealed the intricate nanotextures in thin-film materials, offering scientists a new, streamlined approach to analyzing potential candidates for quantum computing and microelectronics, among other applications. Scientists are especially interested in nanotextures that are distributed non-uniformly throughout a thin film because they can give the material novel properties. The most effective way to study the nanotextures is to visualize them directly, a challenge that typically requires complex electron microscopy and does not preserve the sample. The new imaging technique overcomes these challenges by using phase retrieval and machine learning to invert conventionally-collected X-ray diffraction data – such as that produced at the Cornell High Energy Synchrotron Source, where data for the study was collected – into real-space visualization of the material at the nanoscale.


A Deep Generative Approach to Oversampling in Ptychography

Barutcu, Semih, Katsaggelos, Aggelos K., Gürsoy, Doğa

arXiv.org Artificial Intelligence

Ptychography is a well-studied phase imaging method that makes non-invasive imaging possible at a nanometer scale. It has developed into a mainstream technique with various applications across a range of areas such as material science or the defense industry. One major drawback of ptychography is the long data acquisition time due to the high overlap requirement between adjacent illumination areas to achieve a reasonable reconstruction. Traditional approaches with reduced overlap between scanning areas result in reconstructions with artifacts. In this paper, we propose complementing sparsely acquired or undersampled data with data sampled from a deep generative network to satisfy the oversampling requirement in ptychography. Because the deep generative network is pre-trained and its output can be computed as we collect data, the experimental data and the time to acquire the data can be reduced. We validate the method by presenting the reconstruction quality compared to the previously proposed and traditional approaches and comment on the strengths and drawbacks of the proposed approach.